An Algorithm for Suffix Sorting and Its Applications
Abstract
The suffix tree is a data structure that has found applications in various important areas, such as genetic sequencing, pattern matching and computational biology. Its derivative data structure, the suffix array, is another representation with the added advantage of a small memory footprint. We propose a simple O(n log n)-time divide-and-conquer sort-and-merge algorithm for solving the suffix sorting problem. Given the suffix array, the Longest Common Prefix (LCP) array can be constructed in O(n) time. Our proposed algorithm distinguishes itself from existing suffix array algorithms by the use of a relatively simple partitioning scheme at the division stage. We discuss applications of suffix sorting to different problems in computational biology.
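To make the suffix sorting problem concrete, here is a minimal Python sketch of a naive baseline. It is not the divide-and-conquer algorithm the abstract proposes (the abstract does not give its details); it simply sorts suffix start positions by comparing the suffixes directly. The function name `suffix_array` and the 0-based indexing are assumptions made for illustration.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive baseline only: each comparison may scan O(n) characters, so the
    worst case is O(n^2 log n), well above the O(n log n) bound in the abstract.
    """
    # Sort the indices 0..n-1, using the suffix that starts at each index as the key.
    return sorted(range(len(text)), key=lambda start: text[start:])


if __name__ == "__main__":
    # Suffixes of "banana" in sorted order: a, ana, anana, banana, na, nana
    print(suffix_array("banana"))
```

A baseline like this is mainly useful as a reference for checking the output of a faster construction on small inputs.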
Similar resources
Parallel Suffix Sorting
We present a parallel algorithm for lexicographically sorting the suffixes of a string. Suffix sorting has applications in string processing, data compression and computational biology. The ordered list of suffixes of a string stored in an array is known as Suffix Array, an important data structure in string processing and computational biology. Our focus is on deriving a practical implementati...
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
We present a linear-time algorithm to compute the longest common prefix information in suffix arrays. As two applications of our algorithm, we show that our algorithm is crucial to the effective use of block-sorting compression, and we present a linear-time algorithm to simulate the bottom-up traversal of a suffix tree with a suffix array combined with the longest common prefix information.
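As an illustration of how LCP information can be derived from a suffix array in linear time, the sketch below uses a widely known technique: visit the suffixes in text order and reuse the previous match length minus one. Whether this matches the paper's exact formulation is an assumption, and the name `lcp_array` and the convention `lcp[0] = 0` are chosen here for illustration.

```python
def lcp_array(text: str, sa: list[int]) -> list[int]:
    """Return lcp, where lcp[r] is the length of the longest common prefix of
    the suffixes starting at sa[r - 1] and sa[r]; lcp[0] is defined as 0."""
    n = len(text)
    # rank is the inverse of sa: rank[i] is the sorted position of suffix i.
    rank = [0] * n
    for position, start in enumerate(sa):
        rank[start] = position

    lcp = [0] * n
    h = 0  # match length carried over from the previous text position
    for i in range(n):
        if rank[i] == 0:
            h = 0  # the smallest suffix has no predecessor in sorted order
            continue
        j = sa[rank[i] - 1]  # suffix immediately before suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # dropping the first character loses at most one match
    return lcp


if __name__ == "__main__":
    word = "banana"
    # Naive suffix array, used here only to supply an input.
    sa = sorted(range(len(word)), key=lambda s: word[s:])
    print(lcp_array(word, sa))
```

The counter `h` increases at most 2n times in total across the loop, which is what keeps the running time linear.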
Exposition and Analysis of a Suffix Sorting Algorithm
This paper focuses on the suffix sorting algorithm of Maniscalco [10], which at the time of writing is available only as C++ source code on the Internet. We will refer to the program as MSufSort. MSufSort computes the Inverse Suffix Array (ISA) of an input string, which is equivalent to computing the Suffix Array (converting one to the other is discussed in section 8). Recall that for i ∈ [0..n...
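The equivalence between the Suffix Array and the Inverse Suffix Array can be sketched as a permutation inversion. This is not taken from MSufSort or from the paper's section 8, whose method is not shown here; the helper name `invert_permutation` and the 0-based indexing are assumptions for illustration.

```python
def invert_permutation(perm: list[int]) -> list[int]:
    """Return the inverse of a permutation of 0..n-1.

    Applied to a suffix array it yields the inverse suffix array
    (ISA[SA[r]] = r); applied to an ISA it yields the suffix array again.
    """
    inverse = [0] * len(perm)
    for position, value in enumerate(perm):
        inverse[value] = position
    return inverse


if __name__ == "__main__":
    sa = [5, 3, 1, 0, 4, 2]        # suffix array of "banana"
    isa = invert_permutation(sa)   # sorted rank of each suffix
    print(isa)
    print(invert_permutation(isa) == sa)
```

The conversion takes O(n) time in either direction, at the cost of a second array of length n.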
In-Place Suffix Sorting
Given string T = T[1, ..., n], the suffix sorting problem is to lexicographically sort the suffixes T[i, ..., n] for all i. This problem is central to the construction of suffix arrays and trees with many applications in string processing, computational biology and compression. A bottleneck in these applications is the amount of workspace needed to perform suffix sorting beyond the spac...
Publication date: 2006